Taylor series
On Convergence of Polynomial Approximations to the Gaussian Mixture Entropy
Gaussian mixture models (GMMs) are fundamental to machine learning due to their flexibility as approximating densities. However, uncertainty quantification of GMMs remains a challenge because the differential entropy lacks a closed form. This paper explores polynomial approximations, specifically Taylor and Legendre, to the GMM entropy from a theoretical and practical perspective. We provide new analysis of a widely used approach due to Huber et al. (2008) and show that the series diverges under simple conditions. Motivated by this divergence, we provide a novel Taylor series that is provably convergent to the true entropy of any GMM.
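To make the missing closed form concrete, the following minimal sketch compares a brute-force quadrature estimate of a one-dimensional GMM entropy with a zeroth-order Taylor approximation, in which $\log p(x)$ is replaced by its value at each component mean. This is only an illustration in the spirit of the mean-centred expansion attributed to Huber et al. (2008), not the paper's convergent series. The function names (`gmm_pdf`, `entropy_quadrature`, `entropy_zeroth_order`), the grid size, and the example mixture are all assumptions made here.

```python
import numpy as np


def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at x (scalar or array)."""
    weights, means, stds = (np.asarray(a, dtype=float) for a in (weights, means, stds))
    x = np.asarray(x, dtype=float)[..., None]  # add an axis for the components
    comp = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2.0 * np.pi))
    return comp @ weights


def entropy_quadrature(weights, means, stds, num=200001, pad=12.0):
    """Reference value: -integral of p log p on a wide uniform grid (Riemann sum)."""
    means, stds = np.asarray(means, dtype=float), np.asarray(stds, dtype=float)
    x = np.linspace(np.min(means - pad * stds), np.max(means + pad * stds), num)
    p = gmm_pdf(x, weights, means, stds)
    # Guard against log(0) in the far tails; p * log(p) tends to 0 there anyway.
    integrand = -p * np.log(np.maximum(p, np.finfo(float).tiny))
    return float(np.sum(integrand) * (x[1] - x[0]))


def entropy_zeroth_order(weights, means, stds):
    """Zeroth-order Taylor approximation: H is approximated by -sum_i w_i log p(mu_i)."""
    weights = np.asarray(weights, dtype=float)
    return float(-np.sum(weights * np.log(gmm_pdf(means, weights, means, stds))))


if __name__ == "__main__":
    # Hypothetical two-component mixture, chosen only for illustration.
    w, mu, sd = [0.4, 0.6], [-1.0, 2.0], [0.5, 1.5]
    print("quadrature  :", entropy_quadrature(w, mu, sd))
    print("zeroth order:", entropy_zeroth_order(w, mu, sd))
```

For a single Gaussian the zeroth-order value is $\tfrac{1}{2}\log(2\pi\sigma^2)$, exactly $\tfrac{1}{2}$ nat below the true entropy $\tfrac{1}{2}\log(2\pi e\sigma^2)$, which shows why higher-order terms are needed.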
Symmetric Behavior Regularized Policy Optimization
Zhu, Lingwei, Shah, Haseeb, Chen, Zheng, Nagai, Yukie, White, Martha
Behavior Regularized Policy Optimization (BRPO) leverages asymmetric (divergence) regularization to mitigate the distribution shift in offline Reinforcement Learning. This paper is the first to study the open question of symmetric regularization. We show that symmetric regularization does not permit an analytic optimal policy $\pi^*$, posing a challenge to practical utility of symmetric BRPO. We approximate $\pi^*$ by the Taylor series of Pearson-Vajda $\chi^n$ divergences and show that an analytic policy expression exists only when the series is capped at $n=5$. To compute the solution in a numerically stable manner, we propose to Taylor expand the conditional symmetry term of the symmetric divergence loss, leading to a novel algorithm: Symmetric $f$-Actor Critic (S$f$-AC). S$f$-AC achieves consistently strong results across various D4RL MuJoCo tasks. Additionally, S$f$-AC avoids per-environment failures observed in IQL, SQL, XQL and AWAC, opening up possibilities for more diverse and effective regularization choices for offline RL.
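As a rough illustration of the series mentioned above, an $f$-divergence $D_f(p\|q)=\mathbb{E}_q[f(p/q)]$ can be Taylor expanded around $p/q=1$ as $\sum_{n\ge 2}\frac{f^{(n)}(1)}{n!}\,\chi^n(p\|q)$, where $\chi^n(p\|q)=\sum_x (p(x)-q(x))^n/q(x)^{n-1}$ is the Pearson-Vajda divergence. The sketch below checks this numerically for the (asymmetric) KL divergence, $f(t)=t\log t$, on two small discrete distributions. It is not the S$f$-AC algorithm, and the function names and example distributions are assumptions made here.

```python
import numpy as np


def chi_n(p, q, n):
    """Pearson-Vajda chi^n divergence: sum_x (p - q)^n / q^(n - 1)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum((p - q) ** n / q ** (n - 1)))


def kl_exact(p, q):
    """KL(p || q) for strictly positive discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def kl_series(p, q, order):
    """Taylor series of KL in chi^n divergences, truncated at `order`.

    For f(t) = t log t, f^(n)(1) / n! = (-1)^n / (n (n - 1)) for n >= 2.
    The n = 0 and n = 1 terms vanish because f(1) = 0 and chi^1 = 0.
    The series is only expected to converge when |p/q - 1| < 1 everywhere.
    """
    return sum(
        (-1) ** n / (n * (n - 1)) * chi_n(p, q, n) for n in range(2, order + 1)
    )


if __name__ == "__main__":
    # Hypothetical distributions with p/q close to 1, so the series converges.
    p = [0.30, 0.30, 0.40]
    q = [0.25, 0.35, 0.40]
    print("exact KL:", kl_exact(p, q))
    for order in (2, 3, 5, 8):
        print(f"order {order}:", kl_series(p, q, order))
```

Truncating at a low order trades accuracy for tractability, which is the trade-off the abstract refers to when it caps the series to obtain an analytic policy.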
From Taylor Series to Fourier Synthesis: The Periodic Linear Unit
The dominant paradigm in modern neural networks relies on simple, monotonically-increasing activation functions like ReLU. While effective, this paradigm necessitates large, massively-parameterized models to approximate complex functions. In this paper, we introduce the Periodic Linear Unit (PLU), a learnable sine-wave based activation with periodic non-monotonicity. PLU is designed for maximum expressive power and numerical stability, achieved through its formulation and a paired innovation we term Repulsive Reparameterization, which prevents the activation from collapsing into a non-expressive linear function. We demonstrate that a minimal MLP with only two PLU neurons can solve the spiral classification task, a feat impossible for equivalent networks using standard activations. This suggests a paradigm shift from networks as piecewise Taylor-like approximators to powerful Fourier-like function synthesizers, achieving exponential gains in parameter efficiency by placing intelligence in the neuron itself.